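This study proposes a multimodal machine learning model to predict ICD-10 diagnostic codes. We developed separate machine learning models that can handle data from different modalities, including unstructured text, semi-structured text, and structured tabular data. We further employed an ensemble method to integrate all modality-specific models to generate ICD-10 codes. Key evidence was also extracted to make our predictions more convincing and explainable. We used the Medical Information Mart for Intensive Care III (MIMIC-III) dataset to validate our approach. For ICD code prediction, our best-performing model (Micro-F1 = 0.7633, Micro-AUC = 0.9541) significantly outperforms other baseline models, including TF-IDF (Micro-F1 = 0.6721, Micro-AUC = 0.7879) and Text-CNN (Micro-F1 = 0.6569, Micro-AUC = 0.9235). For interpretability, our approach achieves a Jaccard Similarity Coefficient (JSC) of 0.1806 on text data and 0.3105 on tabular data, where well-trained physicians achieve 0.2780 and 0.5002, respectively.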
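As an illustration of the interpretability metric named above, here is a minimal sketch of the Jaccard Similarity Coefficient between two evidence sets. The token sets and the function name `jaccard` are hypothetical examples, not taken from the paper, and the paper's exact evidence extraction and matching procedure is not specified here.

```python
def jaccard(a, b):
    """Jaccard Similarity Coefficient: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        # Convention chosen for this sketch: two empty sets are identical.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical evidence tokens (illustrative only).
model_evidence = {"chest", "pain", "troponin", "elevated"}
physician_evidence = {"troponin", "elevated", "st", "depression"}

# 2 shared tokens out of 6 distinct tokens overall.
score = jaccard(model_evidence, physician_evidence)
print(round(score, 4))
```

A score of 1 means the two sets agree exactly and 0 means they share nothing, so the reported values can be read as the degree of overlap with the reference evidence.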
translated by 谷歌翻译
Object movement identification is one of the most researched problems in the field of computer vision. In this task, we try to classify each pixel as foreground or background. Even though numerous traditional machine learning and deep learning methods already exist for this problem, most of them have two major issues: the need for large amounts of ground truth data and inferior performance on unseen videos. Since every pixel of every frame has to be labeled, acquiring large amounts of data for these techniques is rather expensive. Recently, Zhao et al. [1] proposed a one-of-a-kind Arithmetic Distribution Neural Network (ADNN) for universal background subtraction, which utilizes probability information from the histogram of temporal pixels and achieves promising results. Building on this work, we developed an intelligent video surveillance system that uses the ADNN architecture for motion detection, trims the video to only the parts containing motion, and performs anomaly detection on the trimmed video.
Test-time adaptation is the problem of adapting a source pre-trained model using test inputs from a target domain without access to source domain data. Most existing approaches address the setting in which the target domain is stationary. Moreover, these approaches are prone to making erroneous predictions with unreliable uncertainty estimates when distribution shifts occur. Hence, test-time adaptation in the face of non-stationary target domain shift becomes a problem of significant interest. To address these issues, we propose a principled approach, PETAL (Probabilistic lifElong Test-time Adaptation with seLf-training prior), which looks into this problem from a probabilistic perspective using a partly data-dependent prior. A student-teacher framework, where the teacher model is an exponential moving average of the student model, naturally emerges from this probabilistic perspective. In addition, the knowledge from the posterior distribution obtained for the source task acts as a regularizer. To handle catastrophic forgetting in the long term, we also propose a data-driven model parameter resetting mechanism based on the Fisher information matrix (FIM). Moreover, improvements in experimental results suggest that FIM-based data-driven parameter restoration contributes to reducing error accumulation and maintaining the knowledge of the recent domain by restoring only the irrelevant parameters. In terms of predictive error rate as well as uncertainty-based metrics such as Brier score and negative log-likelihood, our method achieves better results than the current state-of-the-art for online lifelong test-time adaptation across various benchmarks, such as the CIFAR-10C, CIFAR-100C, ImageNetC, and ImageNet3DCC datasets.
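The teacher-as-moving-average relationship mentioned above can be sketched in a few lines. This is only a generic exponential moving average update over flat parameter lists; the function name `ema_update` and the `decay` value are assumptions for illustration, and the sketch does not reproduce PETAL's prior, regularizer, or FIM-based resetting.

```python
def ema_update(teacher, student, decay=0.999):
    """Return teacher parameters moved slightly toward the student.

    teacher, student: equal-length lists of floats (hypothetical flat
    parameter vectors). decay close to 1 makes the teacher change slowly.
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


# Toy parameters (illustrative only).
teacher = [0.0, 1.0]
student = [1.0, 1.0]

# One update step with a deliberately small decay so the effect is visible.
teacher = ema_update(teacher, student, decay=0.9)
print(teacher)
```

Because the teacher averages over many student updates, its predictions tend to be smoother than the student's, which is the usual motivation for using it as the source of pseudo-labels.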
Only limited studies and superficial evaluations are available on agents' behaviors and roles within a Multi-Agent System (MAS). We simulate a MAS using Reinforcement Learning (RL) in a pursuit-evasion (a.k.a. predator-prey pursuit) game, which shares task goals with target acquisition, and we create different adversarial scenarios by replacing RL-trained pursuers' policies with two distinct (non-RL) analytical strategies. Using heatmaps of agents' positions (a state-space variable) over time, we are able to categorize an RL-trained evader's behaviors. The novelty of our approach lies in the creation of an influential feature set that reveals underlying data regularities, which allow us to classify an agent's behavior. This classification may aid in catching the (enemy) targets by enabling us to identify and predict their behaviors, and when extended to pursuers, this approach to identifying teammates' behavior may allow agents to coordinate more effectively.
The domain of joint vision-language understanding, especially in the context of reasoning in Visual Question Answering (VQA) models, has garnered significant attention in the recent past. While most of the existing VQA models focus on improving the accuracy of VQA, the way models arrive at an answer is oftentimes a black box. As a step towards making the VQA task more explainable and interpretable, our method is built upon the SOTA VQA framework by augmenting it with an end-to-end explanation generation module. In this paper, we investigate two network architectures, including Long Short-Term Memory (LSTM) and Transformer decoder, as the explanation generator. Our method generates human-readable textual explanations while maintaining SOTA VQA accuracy on the GQA-REX (77.49%) and VQA-E (71.48%) datasets. Approximately 65.16% of the generated explanations are approved by humans as valid. Roughly 60.5% of the generated explanations are valid and lead to the correct answers.
The geospace environment is volatile and highly driven. Space weather has effects on Earth's magnetosphere that cause a dynamic and enigmatic response in the thermosphere, particularly in the evolution of neutral mass density. Many models exist that use space weather drivers to produce a density response, but these models are typically computationally expensive or inaccurate for certain space weather conditions. In response, this work aims to employ a probabilistic machine learning (ML) method to create an efficient surrogate for the Thermosphere Ionosphere Electrodynamics General Circulation Model (TIE-GCM), a physics-based thermosphere model. Our method leverages principal component analysis to reduce the dimensionality of TIE-GCM and recurrent neural networks to model the dynamic behavior of the thermosphere much more quickly than the numerical model. The newly developed reduced order probabilistic emulator (ROPE) uses Long Short-Term Memory neural networks to perform time-series forecasting in the reduced state and provide distributions for future density. We show that across the available data, TIE-GCM ROPE has similar error to previous linear approaches while improving storm-time modeling. We also conduct a satellite propagation study for the significant November 2003 storm, which shows that TIE-GCM ROPE can capture the position resulting from TIE-GCM density with < 5 km bias. In contrast, linear approaches provide point estimates that can result in biases of 7 - 18 km.
Federated learning (FL) on deep neural networks facilitates new applications at the edge, especially for wearable and Internet-of-Things devices. Such devices capture a large and diverse amount of data, but they have memory, compute, power, and connectivity constraints which hinder their participation in FL. We propose Centaur, a multitier FL framework, enabling ultra-constrained devices to efficiently participate in FL on large neural nets. Centaur combines two major ideas: (i) a data selection scheme to choose a portion of samples that accelerates the learning, and (ii) a partition-based training algorithm that integrates both constrained and powerful devices owned by the same user. Evaluations on four benchmark neural nets and three datasets show that Centaur gains ~10% higher accuracy than local training on constrained devices with ~58% energy saving on average. Our experimental results also demonstrate the superior efficiency of Centaur when dealing with imbalanced data, client participation heterogeneity, and various network connection probabilities.
While speech recognition Word Error Rate (WER) has reached human parity for English, long-form dictation scenarios still suffer from segmentation and punctuation problems resulting from irregular pausing patterns or slow speakers. Transformer sequence tagging models are effective at capturing long bi-directional context, which is crucial for automatic punctuation. Automatic Speech Recognition (ASR) production systems, however, are constrained by real-time requirements, making it hard to incorporate the right context when making punctuation decisions. In this paper, we propose a streaming approach for punctuation or re-punctuation of ASR output using dynamic decoding windows and measure its impact on punctuation and segmentation accuracy across scenarios. The new system tackles over-segmentation issues, improving segmentation F0.5-score by 13.9%. Streaming punctuation achieves an average BLEU-score improvement of 0.66 for the downstream task of Machine Translation (MT).
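For readers unfamiliar with the F0.5-score used above, here is a minimal sketch of the general F-beta formula. With beta = 0.5, precision is weighted more heavily than recall. The precision and recall values are made-up numbers for illustration, not results from the paper.

```python
def f_beta(precision, recall, beta=0.5):
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical segmentation precision and recall (illustrative only).
precision, recall = 0.8, 0.5

f_half = f_beta(precision, recall, beta=0.5)  # favors precision
f_one = f_beta(precision, recall, beta=1.0)   # standard F1
print(round(f_half, 4), round(f_one, 4))
```

When precision exceeds recall, as in this toy case, the F0.5 value is higher than F1, which reflects the metric's emphasis on avoiding spurious segment boundaries.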
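We hypothesize that existing sentence-level machine translation (MT) metrics become less efficient when the human reference contains ambiguities. To verify this hypothesis, we present a very simple method for extending pretrained metrics to incorporate context at the document level. We apply our method to three popular metrics, BERTScore, Prism, and COMET, and to the reference-free metric COMET-QE. We evaluate the extended metrics on the WMT 2021 metrics shared task using the provided MQM annotations. Our results show that the extended metrics outperform their sentence-level counterparts in about 85% of the tested conditions when results on low-quality human references are excluded. Additionally, we show that our document-level extension substantially improves accuracy on discourse phenomena tasks, outperforming a dedicated baseline by up to 6.1%. Our experimental results support our initial hypothesis and show that a simple extension of the metrics permits them to take advantage of context to resolve ambiguities in the reference.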
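Modern society is interested in capturing high-resolution, high-quality images owing to the proliferation of sophisticated cameras. However, noise contamination not only degrades such images but also adversely affects subsequent processes when the images are used in computer vision tasks such as remote sensing and object tracking. Moreover, the processing time of high-resolution images is constrained by the hardware limitations of the image-capturing instruments. Geodesic Gramian Denoising (GGD) is a manifold-based noise filtering method that we introduced in our past research; it uses a few prominent singular vectors of the geodesics' Gramian matrix for the noise filtering process. The applicability of GGD is limited because it incurs a cost of $\mathcal{O}(n^6)$ when the singular vectors of the $n^2 \times n^2$ data matrix are computed by singular value decomposition (SVD). In this study, we improve the efficiency of the GGD framework by replacing its SVD step with four different singular vector approximation techniques. Here, we compare the computational time and the noise filtering performance of the four techniques integrated into GGD.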
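The $\mathcal{O}(n^6)$ figure follows from the standard cubic cost of a dense SVD. As a short worked step, writing $m$ for the side length of the square data matrix (the symbol $m$ and the numerical example are introduced here only for illustration):

$$
m = n^2 \quad\Longrightarrow\quad \mathcal{O}(m^3) = \mathcal{O}\big((n^2)^3\big) = \mathcal{O}(n^6).
$$

For instance, with $n = 100$ the matrix has $m = 10^4$ rows and columns, so the cubic term is on the order of $10^{12}$ operations, which is why approximating only a few prominent singular vectors is attractive.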